Path Integral Policy Improvement with Covariance Matrix Adaptation
Authors
Abstract
There has been a recent focus in reinforcement learning on addressing continuous state and action problems by optimizing parameterized policies. PI2 is a recent example of this approach. It combines a derivation from first principles of stochastic optimal control with tools from statistical estimation theory. In this paper, we consider PI2 as a member of the wider family of methods that share the concept of probability-weighted averaging to iteratively update parameters in order to optimize a cost function. We compare PI2 to other members of the same family (Cross-Entropy Methods and CMA-ES) at the conceptual level and in terms of performance. The comparison suggests the derivation of a novel algorithm, which we call PI2-CMA for “Path Integral Policy Improvement with Covariance Matrix Adaptation”. PI2-CMA’s main advantage is that it determines the magnitude of the exploration noise automatically.
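The probability-weighted averaging that the abstract names as the shared idea can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a single scalar cost per sample (no per-time-step trajectory costs), a diagonal covariance instead of a full matrix, and an exponential cost-to-weight mapping with a hypothetical sharpness parameter `h`. The function names `probability_weights` and `weighted_update` are invented for illustration.

```python
import math
import random


def probability_weights(costs, h=10.0):
    """Map costs to normalized weights so that lower cost gets higher weight.

    Assumption: costs are rescaled to [0, 1] and exponentiated with sharpness h.
    """
    lo, hi = min(costs), max(costs)
    span = hi - lo
    if span == 0.0:
        # All samples are equally good: fall back to uniform weights.
        return [1.0 / len(costs)] * len(costs)
    raw = [math.exp(-h * (c - lo) / span) for c in costs]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_update(mean, var, cost_fn, n_samples=20, h=10.0, rng=random):
    """One iteration: sample, evaluate, weight, then update mean and variances."""
    dim = len(mean)
    # Explore by sampling parameters around the current mean.
    samples = [
        [rng.gauss(mean[d], math.sqrt(var[d])) for d in range(dim)]
        for _ in range(n_samples)
    ]
    weights = probability_weights([cost_fn(s) for s in samples], h)
    # The new mean is the probability-weighted average of the samples.
    new_mean = [
        sum(w * s[d] for w, s in zip(weights, samples)) for d in range(dim)
    ]
    # The new (diagonal) variance is the weighted spread around the previous
    # mean, so the exploration magnitude follows from the data rather than
    # from a hand-tuned schedule. Using the previous mean is an assumption
    # of this sketch.
    new_var = [
        sum(w * (s[d] - mean[d]) ** 2 for w, s in zip(weights, samples))
        for d in range(dim)
    ]
    return new_mean, new_var


if __name__ == "__main__":
    target = [3.0, -2.0]  # hypothetical optimum of a toy quadratic cost

    def cost(x):
        return sum((a - b) ** 2 for a, b in zip(x, target))

    mean, var = [0.0, 0.0], [1.0, 1.0]
    for _ in range(30):
        mean, var = weighted_update(mean, var, cost)
    print(mean, var)
```

Updating the variances with the same weights as the mean is what removes the need to choose an exploration magnitude by hand in this sketch; a full-covariance version would replace the per-dimension variances with a weighted sum of outer products.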
Similar resources
Adaptation de la matrice de covariance pour l'apprentissage par renforcement direct
Abstract: Solving continuous state and action problems by optimizing parametric policies is a topic of recent interest in reinforcement learning. The PI2 algorithm is an example of this approach; it benefits from solid mathematical foundations drawn from stochastic optimal control and from the tools of statistical estimation theory. In this article, we...
Adaptive exploration through covariance matrix adaptation enables developmental motor learning
FLOWERS Team, INRIA Bordeaux Sud-Ouest, Talence, France. Abstract: The “Policy Improvement with Path Integrals” (PI2) [25] and “Covariance Matrix Adaptation Evolutionary Strategy” [8] are considered to be state-of-the-art in direct reinforcement learning and stochastic optimization respectively. We have recently shown that incorporating covariance matrix adaptation into PI2, which yields the PICM...
Covariance Matrix Estimation for Reinforcement Learning
One of the goals in scaling reinforcement learning (RL) pertains to dealing with high-dimensional and continuous state-action spaces. In order to tackle this problem, recent efforts have focused on harnessing well-developed methodologies from statistical learning, estimation theory and empirical inference. A key related challenge is tuning the many parameters and efficiently addressing numerical...
Task Scheduling Algorithm Using Covariance Matrix Adaptation Evolution Strategy (CMA-ES) in Cloud Computing
Cloud computing is a computational model that provides resources on demand in response to user requests. Planning the scheduling of users' jobs has emerged as an important challenge in the field of cloud computing. This is mainly due to several reasons, including ever-increasing advances in information technology and an increase in applications and...
What Does the Evolution Path Learn in CMA-ES?
The covariance matrix adaptation evolution strategy (CMA-ES) evolves a multivariate Gaussian distribution for continuous optimization. The evolution path, which accumulates historical search directions over successive generations, plays a crucial role in the adaptation of the covariance matrix. In this paper, we investigate what the evolution path approximates in the optimization procedure. We show th...
Journal title:
- CoRR
Volume abs/1206.4621 Issue
Pages -
Publication date 2012